Part 1: Currency Conversion Analysis

Background

In this problem, we will study fluctuations in a currency exchange rate over time.

The file USD-JPY.csv contains the daily USD/JPY exchange rate from January 2000 through May 31, 2022. We will aggregate the data on a weekly basis by taking the average rate within each week. The time series of interest is the weekly exchange rate. We will analyze this time series and its first-order difference.

Instructions on reading the data

To read the data in R, save the file in your working directory (change the R working directory first if the file is stored elsewhere) and read the data using the R function read.csv().

Here we load the libraries needed for this data analysis:

To prepare the data, run the following code snippet. First, aggregate by week:

We now form the weekly aggregated time series to use for data exploration. Please note that we will analyze the weekly aggregated data, not the original (daily) data.
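The preparation code is not shown in this copy. A minimal sketch of the two steps (weekly averaging, then building the time series object) might look like the following. It runs on simulated daily rates so that it is self-contained; the column names Date and Rate and the choice frequency = 52 are assumptions, not details taken from USD-JPY.csv.

```r
# Synthetic stand-in for the daily file; column names Date and Rate are assumptions.
set.seed(1)
dates <- seq(as.Date("2000-01-03"), as.Date("2022-05-31"), by = "day")
dates <- dates[!format(dates, "%u") %in% c("6", "7")]   # keep weekdays only
daily <- data.frame(Date = dates,
                    Rate = 110 + cumsum(rnorm(length(dates), sd = 0.5)))

# Label each day by the week it falls in (weeks start on Monday),
# then average the rate within each week.
daily$Week <- cut(daily$Date, breaks = "week")
weekly <- aggregate(Rate ~ Week, data = daily, FUN = mean)

# Weekly ts object. frequency = 52 is an approximation: a year has about 52.18 weeks.
jpy <- ts(weekly$Rate, start = c(2000, 1), frequency = 52)
```

With the real file, the data frame daily would instead come from read.csv(), with the date column converted using as.Date().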

Please use the jpy series to code and answer the following questions.

Question 1a: Exploratory Data Analysis

Before exploring the data, can you infer the data features from what you know about the USD-JPY currency exchange? Next plot the Time Series and ACF plots of the weekly data. Comment on the main features, and identify what (if any) assumptions of stationarity are violated.

Which type of model do you think will fit the data better: the trend or seasonality fitting model? Provide details for your response.
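As a rough sketch of the two requested plots, the code below draws a time series plot and an ACF plot for a simulated random-walk series that stands in for jpy. The simulated values, the series length, and lag.max of four years are all assumptions made for illustration.

```r
# Simulated stand-in for the weekly series; replace with the real jpy object.
set.seed(2)
jpy <- ts(110 + cumsum(rnorm(1169, sd = 0.8)), start = c(2000, 1), frequency = 52)

# Time series plot: look for trend, changing variance, and seasonality.
plot(jpy, main = "Weekly USD/JPY (simulated stand-in)", ylab = "JPY per USD")

# ACF plot: lag.max is given in number of observations (here 4 years of weeks).
jpy_acf <- acf(jpy, lag.max = 52 * 4, main = "ACF of weekly series")
```

A sample ACF that decays slowly and stays outside the confidence bands over many lags is the usual sign of a trend, which violates the constant-mean assumption of stationarity.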

Response: General Insights on the USD-JPY Currency Rate

The data are the USD to JPY exchange rate over roughly 22 years (2000 to 2022). We can expect fluctuations in the exchange rate due to changes in US and Japanese foreign policy as well as other global events. For this reason, we do not expect the exchange rate to stay constant.

Time Series Plot

ACF Plot

Stationarity

Question 1b: Trend Estimation

Fit the following trend estimation models:

Overlay the fitted values on the original time series. Plot the residuals with respect to time for each model. Plot the ACF of the residuals for each model also. Comment on the four models fit and on the appropriateness of the stationarity assumption of the residuals.
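The list of models is not reproduced in this copy, but the responses below refer to a moving average, a parametric quadratic polynomial, loess, and splines regression. A minimal sketch of those four fits follows, using simulated data in place of jpy. The kernel bandwidth, the loess span, and the use of a natural cubic spline basis with df = 10 are assumptions chosen for illustration, not the assignment's settings.

```r
set.seed(3)
n <- 600
jpy <- ts(110 + cumsum(rnorm(n, sd = 0.8)), start = c(2000, 1), frequency = 52)
t_pts <- seq(0, 1, length.out = n)   # time index rescaled to [0, 1]
y <- as.numeric(jpy)

# 1. Moving average via a Gaussian kernel smoother (bandwidth is an arbitrary choice).
ma_fit <- ksmooth(t_pts, y, kernel = "normal", bandwidth = 0.05, x.points = t_pts)$y
# 2. Parametric quadratic polynomial.
quad_fit <- fitted(lm(y ~ t_pts + I(t_pts^2)))
# 3. Local polynomial regression (loess) with the default span.
loess_fit <- fitted(loess(y ~ t_pts))
# 4. Splines regression on a natural cubic spline basis (df = 10 is an arbitrary choice).
spl_fit <- fitted(lm(y ~ splines::ns(t_pts, df = 10)))

fits <- list(MA = ma_fit, Quadratic = quad_fit, Loess = loess_fit, Splines = spl_fit)
cols <- c("red", "blue", "darkgreen", "purple")

# Overlay the fitted trends on the series.
plot(jpy, col = "grey40", main = "Fitted trends (simulated data)")
for (i in seq_along(fits)) {
  lines(ts(fits[[i]], start = start(jpy), frequency = frequency(jpy)), col = cols[i])
}
legend("topleft", legend = names(fits), col = cols, lty = 1, bty = "n")

# ACF of the residuals for each model.
op <- par(mfrow = c(2, 2))
for (nm in names(fits)) acf(y - fits[[nm]], lag.max = 104, main = paste("Residual ACF:", nm))
par(op)
```

Residual-versus-time plots can be produced the same way by replacing acf() with plot(y - fits[[nm]], type = "l").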

Response: Comparison of the fitted trend models:

Visually, the splines regression follows the data most closely. All peaks and troughs in the data are captured. Further, the residuals are mainly concentrated around 0, which indicates that they are closer to stationarity. However, the residuals are still not stationary, as evidenced by the ACF plot.

In contrast, the parametric quadratic polynomial has a single minimum and does not capture the deviations in the data closely enough. There is clear sea

The moving average and loess approaches show significant deviations in their residuals.

Response: Appropriateness of the trend model for stationarity

The ACF plots of the residuals for all four models indicate non-stationarity, as evidenced by large autocorrelations well away from 0 at every lag.

Question 1c: Differenced Data Modeling

Now plot the differenced time series and its ACF plot. Apply the four trend models in Question 1b to the differenced time series. What can you conclude about the differenced data in terms of stationarity? Which approach would you recommend (trend removal via a fitted trend vs. differencing) in order to obtain a stationary process?

Hint: When TS data are differenced, the resulting data set will have an NA in the first data element due to the differencing.
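A small sketch of the differencing step, again on simulated data standing in for jpy. Whether a leading NA appears depends on how the difference is computed: base R's diff() on a ts object simply returns a series one observation shorter, while a difference built by hand (as assumed in the second half of the sketch) may carry an NA that has to be dropped.

```r
set.seed(4)
jpy <- ts(110 + cumsum(rnorm(600, sd = 0.8)), start = c(2000, 1), frequency = 52)

# diff() on a ts returns n - 1 observations, so there is no NA to remove here.
jpy_diff <- diff(jpy)

# A difference built by hand with a leading NA: drop the NA before modelling.
manual_diff <- c(NA, jpy[-1] - jpy[-length(jpy)])
manual_diff <- manual_diff[!is.na(manual_diff)]

plot(jpy_diff, main = "First-order difference (simulated data)", ylab = "Weekly change")
diff_acf <- acf(jpy_diff, lag.max = 104, main = "ACF of differenced series")
```

The same four trend fits from Question 1b can then be applied to jpy_diff, using a time index of length(jpy_diff).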

Response: Comments about the stationarity of the difference data:

After differencing, the data appear to be somewhat stationary. The residuals from all four models still exhibit some seasonality; however, each is much closer to being stationary.

If the end goal is to achieve a stationary process, I would choose to difference the data rather than detrend. This choice is supported by the ACF plots for the differenced data, which show autocorrelations close to zero.

Part 2: Temperature Analysis

Background

In this problem, we will analyze aggregated temperature data.

The file Everest Temp Jan-Mar 2021.csv contains the hourly average temperature at the Mount Everest Base Camp for the months of January to March 2021.

Instructions on reading the data

To read the data in R, save the file in your working directory (change the R working directory first if the file is stored elsewhere) and read the data using the R function read.csv().

You will perform the analysis and modelling on the Temp data column.

Here are the libraries you will need:

Run the following code to prepare the data for analysis:

Question 2a: Exploratory Data Analysis

Plot both the Time Series and ACF plots. Comment on the main features, and identify what (if any) assumptions of stationarity are violated. Additionally, comment on whether you believe the differenced data are more appropriate for model fitting. Support your response with a graphical analysis.

Hint: Make sure to use the appropriate differenced data.
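The hint does not say which difference is meant. One possible reading, for hourly data with a daily cycle, is to compare the ordinary lag-1 difference with a seasonal lag-24 difference. The sketch below does this on simulated temperatures; the simulated series, the period of 24, and this interpretation of the hint are all assumptions.

```r
set.seed(5)
n <- 90 * 24                       # 90 days of hourly observations
hour_idx <- seq_len(n)
temp <- ts(-15 + 0.003 * hour_idx + 4 * sin(2 * pi * hour_idx / 24) + rnorm(n),
           frequency = 24)

d1  <- diff(temp)                  # lag-1 difference: targets the trend
d24 <- diff(temp, lag = 24)        # lag-24 difference: targets the daily cycle

# Compare the two differenced series graphically.
op <- par(mfrow = c(2, 2))
plot(d1, main = "Lag-1 difference")
plot(d24, main = "Lag-24 difference")
acf_d1  <- acf(d1, lag.max = 72, main = "ACF: lag-1 difference")
acf_d24 <- acf(d24, lag.max = 72, main = "ACF: lag-24 difference")
par(op)
```

Spikes that recur every 24 lags in the ACF of the lag-1 difference would suggest that the daily cycle has not been removed by first-order differencing alone.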

Response: Comments about the time series and ACF plots of the original time series

The time series plot displays some seasonality and is, overall, very noisy.

The ACF plot indicates that the time series deviates from stationarity, as evidenced by the large spikes in the autocorrelation function at all lags. The ACF plot also shows some seasonal effects, with larger autocorrelations at certain lags, and a slight decreasing trend.

The differenced time series is closer to stationary than the non-differenced data. Therefore, I believe it is better suited for fitting the data. However, there does appear to be some trend, given the large positive and negative spikes in the autocorrelation function.

Question 2b: Seasonality Estimation

Separately fit a seasonality harmonic model and the ANOVA seasonality model to the temperature data. Evaluate the quality of each fit with residual analysis. Does one model perform better than the other? Which model would you select to fit the seasonality in the data?
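A minimal base-R sketch of the two seasonality models, fitted to simulated hourly temperatures. It assumes a 24-hour period, a harmonic model with a single sine/cosine pair, and an ANOVA model with one mean per hour of the day; the assignment's own code may use different helper functions or more harmonics.

```r
set.seed(6)
n <- 90 * 24
hour_idx <- seq_len(n)
hour_of_day <- factor((hour_idx - 1) %% 24)
temp <- -15 + 4 * sin(2 * pi * hour_idx / 24) +
  1.5 * cos(4 * pi * hour_idx / 24) + rnorm(n)

# Harmonic model: one sine/cosine pair at the daily frequency (2 seasonal parameters).
harm_fit <- lm(temp ~ sin(2 * pi * hour_idx / 24) + cos(2 * pi * hour_idx / 24))
# ANOVA model: a separate mean for each hour of the day (23 seasonal parameters).
anova_fit <- lm(temp ~ hour_of_day)

# Residual sums of squares and AIC for both fits.
rss <- c(harmonic = sum(resid(harm_fit)^2), anova = sum(resid(anova_fit)^2))
print(rss)
print(AIC(harm_fit, anova_fit))

# Residual ACFs for the residual analysis.
op <- par(mfrow = c(1, 2))
acf(resid(harm_fit), lag.max = 72, main = "Residual ACF: harmonic")
acf(resid(anova_fit), lag.max = 72, main = "Residual ACF: ANOVA")
par(op)
```

Because the single-pair harmonic model is nested in the hourly ANOVA model, the ANOVA residual sum of squares can never be larger. Smaller residuals alone therefore do not show that the extra parameters are worthwhile, so a criterion that penalizes model size, such as AIC, is a useful complement.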

The ANOVA model fits the seasonality of the data more closely. This is further evidenced by the residual plots, where the residuals are smaller for the ANOVA model than for the harmonic model. I would choose the ANOVA model because its seasonal fit is closer than that of the harmonic model.

Question 2c: Trend-Seasonality Estimation

Using the time series data, fit the following models to estimate the trend with seasonality fitted using ANOVA:

Overlay the fitted values on the original time series. Plot the residuals with respect to time. Plot the ACF of the residuals. Comment on how the two models fit and on the appropriateness of the stationarity assumption of the residuals.

What form of modelling seems most appropriate and what implications might this have for how one might expect long term temperature data to behave? Provide explicit conclusions based on the data analysis.
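The two models are not listed in this copy; the response below contrasts a parametric and a non-parametric fit. A sketch under that assumption follows, with a quadratic trend as the parametric model and a natural cubic spline trend (df = 8, an arbitrary choice) standing in for the non-parametric one. Both use hour-of-day ANOVA seasonality and are fitted to simulated data.

```r
set.seed(7)
n <- 90 * 24
hour_idx <- seq_len(n)
t_pts <- seq(0, 1, length.out = n)            # time rescaled to [0, 1]
hour_of_day <- factor((hour_idx - 1) %% 24)
temp <- -15 + 6 * t_pts^2 + 2 * sin(6 * pi * t_pts) +
  4 * sin(2 * pi * hour_idx / 24) + rnorm(n)

# Parametric trend (quadratic) plus ANOVA seasonality.
par_fit <- lm(temp ~ t_pts + I(t_pts^2) + hour_of_day)
# Spline trend as a stand-in for a non-parametric trend, plus ANOVA seasonality.
spl_fit <- lm(temp ~ splines::ns(t_pts, df = 8) + hour_of_day)

# Overlay the fitted values on the series.
plot(temp, type = "l", col = "grey60", xlab = "Hour", main = "Trend + seasonality fits")
lines(fitted(par_fit), col = "blue")
lines(fitted(spl_fit), col = "red")
legend("topleft", legend = c("Quadratic + ANOVA", "Spline + ANOVA"),
       col = c("blue", "red"), lty = 1, bty = "n")

# Residuals over time and their ACFs.
op <- par(mfrow = c(2, 2))
plot(resid(par_fit), type = "l", main = "Residuals: quadratic")
plot(resid(spl_fit), type = "l", main = "Residuals: spline")
acf(resid(par_fit), lag.max = 72, main = "Residual ACF: quadratic")
acf(resid(spl_fit), lag.max = 72, main = "Residual ACF: spline")
par(op)
```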

Response: Model Comparison

The non-parametric model performs better, as evidenced by the residual analysis. The residuals of both models deviate from stationarity: the ACF plots of both sets of residuals show large spikes.